Abstract

In this study we introduce and analyze the statistical structural properties of a
model of growing networks which may be relevant to social networks. At each step
a new node is added which selects k possible partners from the existing network
and joins them with probability δ by undirected edges. The 'activity' of the node
ends here; it will get new partners only if it is selected by a newcomer. The model
produces an infinite-order phase transition when a giant component appears at a
specific value of δ, which depends on k. The average component size is discontinuous
at the transition. In contrast, the network behaves significantly differently for
k = 1. There is no giant component formed for any δ and thus in this sense there
is no phase transition. However, the average component size diverges for δ ≥ 1/2.



1 Introduction 



There are many kinds of networks including probably the most influential network of
all, the World Wide Web [1]. This network is a popular one to analyze because of its
size and easy accessibility for statistical analysis. However, there are many other
networks that share some of the properties of the Web and some that do not. Among these
networks we find social networks [2][3][4], collaboration nets [5][6], industrial
and business related networks [7][8][9], transportation nets [10] and many biological
related nets such as food, ecological, and protein interaction networks [11][12][13][14]
and neural networks [15].

The mathematical description of networks started with the fundamental works of Erdős
and Rényi [16][17], which in the absence of reliable data on large networks were rarely
compared to real networks. Recently, the computational boom has provided us with an
increasing number of types of networks and more data on these networks. One of
the most exciting discoveries is the scale-free structure of certain evolving networks
[18][19][20]. These nets have a power-law degree distribution, where only a few vertices
have many connections to the others and the rest of the graph is sparsely connected.
To explain the origin of this scale-free structure of networks Barabási et al. [21][22]
suggested the mechanism of preferential attachment and emphasized the key role of
growth. In their model the probability of a new node connecting to an existing node is
proportional to the degree of the target node. Variations on this model include networks
with aging of nodes, nonlinear attachment probabilities, and re-wiring of existing
links [23][24][25][26].

Probably the most obvious feature of real networks that is missing from most of the
models studied by mathematicians and physicists is the set of characteristics of
individual nodes which influence the connection probability. Thus, if the nodes
represent individual persons, it is obvious that in many circumstances two people are more
likely to become connected in some form of relationship because of the nature of their
individual characteristics. Our model is motivated by the need to incorporate this idea.
A similar idea was used in a preferential attachment model by Bianconi and Barabási
[27], who assigned to each new node a fitness parameter. In their model a larger fitness
parameter may overcompensate the smaller probability of attachment.

In our study we propose a simple model of growing networks whose statistical properties
are identical to those of a more complicated model containing nodes with distinct
characteristics. We will calculate the edge distribution of the growing network, the
distribution of cluster sizes and the emergence of a giant cluster. We will also show how
the number of attempted connections made when a new node is added determines the position
and type of the phase transition as well as the cluster size distribution.



2 The Model 

We first consider a social network model where each node has individual characteristics
or traits. Each node that is added to the network is assigned a permanent set of random
traits which could be coded as an ordered binary string or vector of length L. When a
node is added it randomly chooses k ∈ ℕ possible partners from the already existing
nodes, or if there are fewer than k + 1 (because the simulation has not yet reached time
step k + 2) it chooses all the existing nodes as possible partners. A trait distance between
the new node and one of its possible partners is calculated based on their trait vectors
(t_1, t_2) using a distance measure, D(t_1, t_2), such as the Hamming distance.
Then a connection is formed between the two nodes with a probability p(D) that is a
given function of the distance. Different functions p(D) correspond to different
sociopsychological situations. Thus, if we wish to model
the case where people are more likely to link together if they have similar traits, then
p(D) would be a monotonically decreasing function of D. For this case, the simplest
choice of p(D) would be to form a link if D is below some threshold. This procedure is
repeated for each possible partner of the new node. Thus, each new node can have initially
up to k links with the other existing nodes. Existing nodes can have more than k links as
more nodes are added to the network and link up with the existing nodes. There are no
multiple links between pairs of nodes.
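As an illustration of the threshold rule just described, the following sketch computes the Hamming distance between two trait strings and the pair-averaged probability that an attempted link forms. It assumes (this is not stated in the text) that the L trait bits are independent and uniformly distributed, so that the distance between two random trait vectors is binomially distributed; the function names are hypothetical.

```python
from math import comb


def hamming(a, b):
    """Number of positions at which two equal-length trait strings differ."""
    if len(a) != len(b):
        raise ValueError("trait vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def threshold_rule(d, threshold):
    """Simplest p(D): link with certainty if D is below the threshold."""
    return 1.0 if d < threshold else 0.0


def mean_link_probability(length, p):
    """Average of p(D) over the distance distribution r(D).

    Assumption: independent, uniformly random bits, so that
    r(D) = C(length, D) / 2**length.
    """
    return sum(p(d) * comb(length, d) / 2**length for d in range(length + 1))


if __name__ == "__main__":
    print(hamming("0110", "0011"))
    print(mean_link_probability(4, lambda d: threshold_rule(d, 2)))
```

With other trait distributions only r(D) changes; the averaging step stays the same.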

Because each node is given a random trait vector, and the nodes to link to are also
chosen randomly, many properties of the network simply depend on the probability δ
that two chosen nodes will link together:

$$\delta = \sum_{D} p\bigl(D(t_1, t_2)\bigr)\, r\bigl(D(t_1, t_2)\bigr), \tag{1}$$

where r(D) is the probability of the distance D between two nodes, and the sum is over
all possible distance values. Thus, the model is reduced to the following procedure. At
each time step we add a node to the network, and attempt to link it with k existing nodes
which are chosen at random. An actual connection is made with probability δ. The
asymptotic behavior of the network in the limit of large time t does not depend on the
initial condition of starting with a single isolated vertex.
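The reduced procedure can be sketched in a few lines. This is a minimal illustration, not the simulation code used for the figures: the function names, the adjacency-set representation and the seeding are our own choices, and `delta` stands for the connection probability of Eq. (1).

```python
import random


def grow_network(t_max, k, delta, seed=None):
    """Grow a network for t_max time steps, starting from one isolated node.

    Each new node picks min(k, number of existing nodes) distinct partners
    uniformly at random and links to each with probability delta.
    Returns a list of neighbor sets, indexed by creation order.
    """
    rng = random.Random(seed)
    neighbors = [set()]
    for new in range(1, t_max):
        partners = rng.sample(range(new), min(k, new))
        neighbors.append(set())
        for p in partners:
            if rng.random() < delta:
                neighbors[new].add(p)
                neighbors[p].add(new)
    return neighbors


def largest_cluster_fraction(neighbors):
    """Size of the largest connected cluster divided by the number of nodes."""
    n = len(neighbors)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in neighbors[v]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
        best = max(best, size)
    return best / n


if __name__ == "__main__":
    net = grow_network(20000, k=5, delta=0.5, seed=1)
    print(sum(len(s) for s in net) / len(net))  # mean degree
    print(largest_cluster_fraction(net))
```

Sampling partners without replacement is what enforces the rule that there are no multiple links between a pair of nodes.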

Although many structural properties of a network of nodes with trait vectors depend
only on δ, there are other properties which will depend on the detailed form
of p(D) and the nature of the trait vectors. Examples of such properties include the
distribution of traits in different parts of the network and the correlation of traits with
distance in the network. For example, one can imagine a very simple network of nodes
representing men and women. In one network the probability of forming a link is
independent of sex, and in the other persons prefer to link up with members of the opposite
sex. As long as the mean probability of two chosen nodes linking together is the same
in the two scenarios the structural properties of the two networks will be the same, but
the distribution of men and women within the network will be quite different in the two
cases. In this paper we confine ourselves to the structural properties of networks and
are considering these other non-structural properties in our current research.



3 Age dependence of the expected number of edges, and the edge distribution

The expected number of edges at a node is approximately

$$K_N(t) \approx \underbrace{\delta k}_{\text{initial connections}} + \underbrace{\sum_{s=N+1}^{t} \frac{\delta k}{s-1}}_{\text{later connections}} = \delta k \left(1 + H_{t-1} - H_{N-1}\right), \tag{2}$$

where N > k is the time step at which the node was created (the smaller N is, the older
the node is), t is the total simulation time, δ is the probability that two nodes form a
connection, k is the maximum number of initial connections of a newly created node, and
H_n is the n-th harmonic number, given by H_n = Σ_{i=1}^{n} 1/i for n > 0 and H_0 = 0.
This equation shows that the number of edges of a node depends heavily on the age of
the node.

Equation (2) slightly overestimates the number of connections for the oldest nodes in
the network in two respects. First, the above formula assumes that a node always has
k possible initial connections. However, multiple connections between a pair of nodes
are not allowed, and there are fewer than k available partners for the initial connections
of a node created before or in the k-th time step (overestimation of initial connections).
Second, the term for the later connections assumes that a node has a k/(m − 1) chance
of being selected as the partner of the m-th node (which chooses k possible partners out
of m − 1 already existing nodes). However, for a node created in time step N < k,
this term yields a probability of being chosen greater than 1 between time steps N + 1
and k (where m − 1 < k), which is unacceptable, again because multiple connections
between a pair of nodes are not allowed (overestimation of later connections). Below is
the formula correcting these errors, but we will use the simpler, uncorrected formula in
the remaining part of our paper because the errors are negligible.

$$K_N(t) \approx \delta k \left(1 + H_{t-1} - H_{N-1}\right) - \underbrace{\cdots}_{\text{correction for oldest nodes}} \tag{3}$$

Here we use H_n ≈ ln n + γ + 1/(2n), where γ = −∫_0^∞ e^{−x} ln x dx ≈ 0.5772 is the
Euler-Mascheroni constant [28], and a_t = 1 + ln(t − 1) + 1/(2(t − 1)).

Note that the first k + 1 nodes are expected to have the same number of connections
(because K_N does not depend on N in their case), and the edge number starts breaking
down exponentially for nodes created after time step k + 1 (Figs. 1 and 2). This means
that this growth mechanism is identical to that where the first k + 1 nodes are created
in the same time step.
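A short numerical check of the uncorrected estimate may be useful here. The sketch below evaluates δk(1 + H_{t−1} − H_{N−1}) directly from the harmonic numbers and compares it with the logarithmic approximation obtained from H_n ≈ ln n + γ + 1/(2n). The function names are hypothetical, and the parameter values are chosen only for illustration.

```python
from math import log


def harmonic(n):
    """n-th harmonic number, with H_0 = 0."""
    return sum(1.0 / i for i in range(1, n + 1))


def expected_edges_exact(N, t, k, delta):
    """Uncorrected estimate: delta * k * (1 + H_{t-1} - H_{N-1})."""
    return delta * k * (1.0 + harmonic(t - 1) - harmonic(N - 1))


def expected_edges_log(N, t, k, delta):
    """Same quantity with H_n replaced by ln n + gamma + 1/(2n); requires N > 1."""
    a_t = 1.0 + log(t - 1) + 1.0 / (2 * (t - 1))
    return delta * k * (a_t - log(N - 1) - 1.0 / (2 * (N - 1)))


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, expected_edges_exact(N, 10000, 5, 0.5),
              expected_edges_log(N, 10000, 5, 0.5))
```

The Euler-Mascheroni constant cancels in the difference of the two harmonic numbers, which is why it does not appear in the approximation.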

We now wish to determine the edge distribution, P(X), equal to the probability that
a node picked at random has on average X edges. We return to Eq. (2), ignoring the
correction term in Eq. (3), and write the formula for K_N(t) in the simpler form
(Figs. 1 and 2):

$$K_N(t) \approx \delta k \left(a_t - \ln(N-1) - \frac{1}{2(N-1)}\right), \tag{4}$$

where a_t is the same as that defined after Eq. (3). Using Eq. (4), neglecting its term
1/(2(N − 1)) for N large enough, and knowing that the age distribution of nodes is uniform,
we analytically approximate the edge distribution of the network with the following
exponential:

$$P\bigl(K(t) = X\bigr) = \frac{1}{\delta k t}\, e^{-\frac{X}{\delta k} + a_t}. \tag{5}$$

We used the standard transformation rule for random variables, P(N) = P(K_N) |dK_N/dN|,
with P(N) = 1/t. For sufficiently large t, due to the definition of a_t, this can be
effectively approximated by a distribution which is independent of t (Figs. 1 and 2):

$$P(X) = \frac{1}{\delta k}\, e^{-\frac{X}{\delta k} + 1}. \tag{6}$$

Figure 1: (A) Expected number of edges of a node (K_N) as a function of the age of
the node (N) at different ages of the network (t). Symbols are numerically calculated
values from Eq. (3), showing that the first k + 1 nodes have the same number of
connections at any t, whereas there is an exponential breakdown in the expected number
of edges for nodes created later than these. Lines represent approximations by Eq. (4):
the number of edges of the first k nodes is overestimated because the correction term
in Eq. (3) was ignored (see text). Note that the x-axis is logarithmic. (B) Edge
distribution of the network. Symbols represent numerically calculated distributions, where
the numbers of edges of individual nodes were obtained from Eq. (3). These numbers were
binned into integer values and the relative frequencies of occurrences in each bin were
plotted. The line represents the approximate distribution given by Eq. (6), showing that
it is valid for the edge distribution at large values of t (for t > 100). The mismatch
between approximated and actual distributions at the highest connection numbers is due
to the same reason as in (A). Note that the y-axis is logarithmic. Parameters were δ = 0.5,
k = 5.

Figure 2: (A) Numerical simulation of the average expected number of edges of a node
(K_N) as a function of the age of the node (N). Symbols are results of numerical
simulations; the line is the graph of Eq. (4). Also see the notes of Fig. 1. (B) Numerical
simulation of the edge distribution of the network (symbols). The line is the graph of
Eq. (6). The average relative frequency of each number of edges and its standard deviation
were calculated. The deviation of the simulation from the analytical results at high numbers
of edges is a result of the finite size of the simulated networks, due to the dispersion of
the number of edges around its expected value as shown in part A of this figure. As the age
of the network increases this deviation disappears and the simulation results approach the
analytical approximation over a longer interval. At low numbers of edges the deviation
results from neglecting the correction term in Eq. (4). Averages and standard deviations
were calculated from 100 simulations. Parameters were δ = 0.5, k = 5, as in Fig. 1.

We can also determine a slightly different degree or edge distribution, which is the
percentage of nodes with m edges. Denote by d_m(t) the expected number of nodes
with degree m at time t. The number of isolated nodes, d_0(t), will increase by (1 − δ)^k,
which is the probability of the added node not connecting to any existing node, and
decrease on average by kδ d_0(t)/t:

$$d_0(t+1) = d_0(t) + (1-\delta)^k - k\delta\,\frac{d_0(t)}{t}. \tag{7}$$

The formula for the expected number of nodes of degree m > 0 is a bit more complicated.
For 1 ≤ m ≤ k there are two ways to increase d_m: either selecting degree m − 1
nodes for connection with the new node or the new node having exactly m edges. For
m > k, the new node cannot contribute to d_m. The decrease will be proportional to
the probability of choosing a degree m node for attachment.

$$1 \le m \le k:\quad d_m(t+1) = d_m(t) + k\delta\,\frac{d_{m-1}(t)}{t} + \binom{k}{m}\delta^m (1-\delta)^{k-m} - k\delta\,\frac{d_m(t)}{t}, \tag{8}$$

$$m > k:\quad d_m(t+1) = d_m(t) + k\delta\,\frac{d_{m-1}(t)}{t} - k\delta\,\frac{d_m(t)}{t}. \tag{9}$$

These equations are correct as t → ∞, and numerical simulations show that d_m(t) ~
p_m t. Substituting this form into the equations for d_m(t) we obtain

$$m > k:\quad p_m = p_k \left(\frac{k\delta}{1 + k\delta}\right)^{m-k}. \tag{11}$$



This degree distribution p_m decays exponentially, consistent with our previous result
for P(X).
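The stationary degree fractions can also be generated numerically. Inserting the linear form d_m(t) = p_m t into the rate equations for d_m(t) gives, by our own rearrangement, the recursion p_m (1 + kδ) = kδ p_{m−1} + C(k, m) δ^m (1 − δ)^{k−m}, where the binomial term is absent for m > k and p_{−1} = 0. The sketch below iterates this recursion; the function name is hypothetical.

```python
from math import comb


def degree_distribution(k, delta, m_max):
    """Stationary fractions p_0 .. p_m_max from the linear ansatz d_m(t) = p_m * t."""
    p = []
    for m in range(m_max + 1):
        # Probability that the newly added node itself ends up with exactly m links.
        arrive = comb(k, m) * delta**m * (1 - delta)**(k - m) if m <= k else 0.0
        prev = p[m - 1] if m > 0 else 0.0
        p.append((k * delta * prev + arrive) / (1 + k * delta))
    return p


if __name__ == "__main__":
    p = degree_distribution(k=5, delta=0.5, m_max=200)
    print(sum(p))                                 # normalization
    print(sum(m * pm for m, pm in enumerate(p)))  # mean degree
    print(p[10] / p[9])                           # tail ratio for m > k
```

For m > k the ratio of successive terms is the constant kδ/(1 + kδ), which is the geometric (exponential) tail of the degree distribution. The mean degree comes out as 2kδ, as expected when each new node adds kδ edges on average.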



4 Critical behavior 



4.1 Cluster size distribution 

In some network models, such as the preferential attachment models, all the nodes
belong to a single cluster. For such models the focus is on the degree distribution
and the distance between nodes in the network. However, our network can contain
a number of disconnected clusters of nodes. Then the key questions become what
the cluster size distribution is and whether there is a phase transition between a
collection of finite size clusters and the appearance of a giant cluster much larger than
the rest. The transition is similar to that in percolation, with our parameter δ playing
the role of the site occupation probability in a percolation model. The key difference
between our model and percolation models is that our nodes do not sit on a lattice
structure, and there are thus no geometric constraints. The definition of a giant cluster
in our model is somewhat different from that of a spanning cluster in percolation models.
Nevertheless, some of the behavior is similar.

Our model is similar to one by Callaway et al. [29] where an infinite-order phase
transition was found. In that model, after a node was added to the network, two nodes
were picked at random and connected with probability δ. Our model is more general
in that we consider the effect of making more than one link at any given time. Also, in
our model the new links are between the added node and existing nodes, whereas in
the model by Callaway et al. the new links are between any two nodes in the network.

To determine the cluster distribution we use a procedure similar to the one we used
to calculate the degree distribution. The cluster number N_j(t) denotes the expected
number of clusters of size j. On average, at each time step, (1 − δ)^k isolated nodes
arrive at the network and kδ N_1(t)/t nodes will be chosen for attachment, reducing N_1.
Thus, N_1 is described by

$$N_1(t+1) = N_1(t) + (1-\delta)^k - k\delta\,\frac{N_1(t)}{t}. \tag{12}$$

For j > 1 new clusters of size j come from connecting the new node to a cluster of
size j − 1 or, if k > 1, from using the new node to make connections between smaller
clusters whose sizes add up to j − 1. N_j is reduced by the jkδ N_j(t)/t nodes from
clusters of size j that connect to the new node. Thus, we have

$$N_2(t+1) = N_2(t) + \binom{k}{1}\delta(1-\delta)^{k-1}\,\frac{N_1(t)}{t} - 2k\delta\,\frac{N_2(t)}{t}, \tag{13}$$

$$N_j(t+1) = N_j(t) + \sum_{r=1}^{\min(k,\,j-1)} \binom{k}{r}\delta^r(1-\delta)^{k-r} \sum_{\substack{z_1+\dots+z_r = j-1 \\ z_i \ge 1}} \; \prod_{i=1}^{r} \frac{z_i N_{z_i}(t)}{t} \;-\; jk\delta\,\frac{N_j(t)}{t}. \tag{14}$$

The first sum in Eq. (14) determines the number of sums in the next term. Each of
these sums represents a cluster that is merged into the cluster of size j. These equations
are valid for t → ∞, where the probability of closed loops tends to zero. The giant
cluster, if one exists, is an exception in which the connection of nodes in loops is not
negligible. Thus, Eq. (14) holds only for the finite sized clusters in the network. This
property lets us determine a generating function which we can use to find the size of
the giant cluster. Our simulations show that solutions of Eqs. (12), (13) and (14) are of
the steady state form N_j(t) = a_j t. Using this form in Eqs. (12), (13) and (14), we find

$$a_1 = \frac{(1-\delta)^k}{1 + k\delta}, \tag{15}$$

$$a_2 = \frac{k\delta(1-\delta)^{k-1}\, a_1}{1 + 2k\delta}, \tag{16}$$
Figure 3: Cluster size distribution for different values of δ and k = 1 (left), k = 2
(right). Solid, dashed and dotted lines are obtained from a least squares fit for the
interval 11 > ln(N_j) > −4 (A) and 20 > ln(N_j) > 2 (B), indicating the power-law behavior
of the distributions. Simulation data were obtained by averaging over 500 runs of 10^7
time steps and are shown on a log-log plot. Note that in figure (B) the simulated
distributions for δ = 0.05 and δ = 0.3 do not follow a power law. In Section 4.2 it is
shown that there is a phase transition near δ = 0.146.




$$a_j = \frac{1}{1 + jk\delta} \sum_{r=1}^{\min(k,\,j-1)} \binom{k}{r}\delta^r(1-\delta)^{k-r} \sum_{\substack{z_1+\dots+z_r = j-1 \\ z_i \ge 1}} \; \prod_{i=1}^{r} z_i a_{z_i}. \tag{17}$$



Generally we cannot obtain a simpler equation for the cluster size distribution a_j,
except for k = 1. Substituting k = 1 into Eqs. (15), (16) and (17) we obtain after
some algebra the general result

$$a_j = (1-\delta)\,\delta^{j-1}\,(j-1)!\,\prod_{m=1}^{j} \frac{1}{1 + m\delta}, \tag{18}$$

which can be written in the form:

$$a_j = \frac{(1-\delta)\,\Gamma(1/\delta)\,\Gamma(j)}{\delta^2\,\Gamma(j + 1 + 1/\delta)}, \tag{19}$$

where Γ(x) denotes the gamma function. Eq. (19) shows that the cluster size
distribution for k = 1 always follows a power-law distribution. This result is confirmed by
simulations shown in the left graph of Fig. 3. Distributions of cluster sizes for k = 2
(right graph of Fig. 3), in contrast to k = 1, show power-law behavior only near the
phase transition.
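The steady-state coefficients a_j can be generated numerically for any k, which gives a simple check of the k = 1 closed form. The sketch below is our own arrangement of the steady-state recursion: the inner sums over cluster sizes adding up to j − 1 are stored as repeated convolutions of b_z = z a_z. The function names and the table layout are hypothetical.

```python
from math import comb, exp, lgamma, log


def cluster_coefficients(k, delta, j_max):
    """Return [a_1, ..., a_j_max] from the steady-state recursion.

    conv[r][n] holds the sum over z_1 + ... + z_r = n (all z_i >= 1)
    of b[z_1] * ... * b[z_r], where b[z] = z * a[z].
    """
    a = [0.0] * (j_max + 1)
    b = [0.0] * (j_max + 1)
    conv = [[0.0] * (j_max + 1) for _ in range(k + 1)]
    for j in range(1, j_max + 1):
        n = j - 1
        gain = (1 - delta) ** k if j == 1 else 0.0  # isolated newcomers
        if n >= 1:
            conv[1][n] = b[n]
            for r in range(2, k + 1):
                conv[r][n] = sum(b[z] * conv[r - 1][n - z] for z in range(1, n))
            gain += sum(comb(k, r) * delta**r * (1 - delta) ** (k - r) * conv[r][n]
                        for r in range(1, k + 1))
        a[j] = gain / (1 + j * k * delta)
        b[j] = j * a[j]
    return a[1:]


def closed_form_k1(delta, j):
    """Gamma-function expression for a_j at k = 1, evaluated in log space."""
    return exp(log(1 - delta) + lgamma(1 / delta) + lgamma(j)
               - 2 * log(delta) - lgamma(j + 1 + 1 / delta))


if __name__ == "__main__":
    a = cluster_coefficients(1, 0.3, 50)
    print(a[9], closed_form_k1(0.3, 10))
```

Working with log-gamma values avoids overflow in the gamma functions for large j. The cost of the recursion grows as k times j_max squared, which is fine for a check but not for very long tails.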



4.2 Position of the phase transition



Fig. 4 shows the simulation results for S, the ratio of the average size of the largest
cluster to the total number of nodes, versus the connection probability δ. The figure
suggests that there is a smooth transition in the appearance of S at a specific value of δ
between δ = 0 and δ = 0.2, which depends on the parameter k. To predict the position
of a possible phase transition δ_c [29], we will use a generating function for the cluster
size distribution [30]. To derive the generating function we use the iterative Eqs. (15),
(16), and (17). The generating function will be of the form:

$$g(x) = \sum_{j=1}^{\infty} b_j x^j, \tag{20}$$

where

$$b_j = j a_j \tag{21}$$

is the probability that a randomly chosen node is from a cluster of size j. Multiplying
both sides of Eqs. (15), (16) and (17) by j x^j, and summing over j we derive a differential
equation for g(x):

$$g = -k\delta\, x g' + x(1-\delta)^k + \sum_{r=1}^{k} \binom{k}{r}\delta^r(1-\delta)^{k-r}\left(x^2 r g^{r-1} g' + x g^r\right). \tag{22}$$



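Rearranging for g' we obtain

$$g' = \frac{(1-\delta)^k - g/x + \sum_{r=1}^{k}\binom{k}{r}\delta^r(1-\delta)^{k-r} g^r}{k\delta - x\sum_{r=1}^{k}\binom{k}{r}\delta^r(1-\delta)^{k-r}\, r g^{r-1}}, \tag{23}$$

which can be further simplified to

$$g' = \frac{-g/x + \bigl(1 + (g-1)\delta\bigr)^k}{k\delta - x k\delta\bigl(1 + (g-1)\delta\bigr)^{k-1}}. \tag{24}$$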



The generating function for the finite size clusters is exactly one at x = 1 when there
is no giant cluster in the network and g(1) < 1 otherwise. Hence

$$S = 1 - g(1). \tag{25}$$

Without an analytic solution for Eq. (24), we calculate S numerically by integrating
Eq. (24) with the initial condition (x, g(x)) = (x_0, x_0(1 − δ)^k/(1 + kδ)) where
x_0 is small. This is equivalent to starting with a cluster of only one node. Fig. 4
shows results from direct simulations of the model (symbols) together with solid lines
from the integration of the generating function. The agreement is good, which supports
the approximations.
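The numerical integration can be sketched as follows. This is an illustration under our own assumptions, not the integrator used for Fig. 4: it uses a fixed-step fourth-order Runge-Kutta scheme, starts from a small x_0 with g(x_0) = x_0 (1 − δ)^k/(1 + kδ), and stops slightly below x = 1 to avoid evaluating the 0/0 form of the right-hand side exactly at the end point. Step count, x_0 and the end point are arbitrary choices.

```python
def giant_cluster_size(k, delta, steps=20000, x0=1e-6, x_end=1.0 - 1e-6):
    """Estimate S = 1 - g(1) by integrating the ODE for g(x) with RK4."""

    def rhs(x, g):
        base = 1.0 + (g - 1.0) * delta
        numerator = -g / x + base**k
        denominator = k * delta - x * k * delta * base ** (k - 1)
        return numerator / denominator

    x = x0
    g = x0 * (1.0 - delta) ** k / (1.0 + k * delta)  # single-node clusters only
    h = (x_end - x0) / steps
    for _ in range(steps):
        k1 = rhs(x, g)
        k2 = rhs(x + 0.5 * h, g + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h, g + 0.5 * h * k2)
        k4 = rhs(x + h, g + h * k3)
        g += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return 1.0 - g


if __name__ == "__main__":
    for delta in (0.05, 0.1, 0.2, 0.3, 0.5):
        print(delta, giant_cluster_size(2, delta))
```

Close to the transition the giant cluster is extremely small, so a fixed-step scheme like this one should not be trusted there; an adaptive integrator with tight tolerances would be a more careful choice.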
Figure 4: Giant cluster size S as a function of δ and k. Symbols are from simulations
of the growing network for 10^6 time steps averaged over 30 runs. Lines are from the
analytical calculations.



To discuss the phase transition location we first consider the cases k > 1. Consider
the expected size of the finite cluster to which a randomly chosen node belongs. We can
determine this quantity in terms of the generating function g(x):

$$\langle s \rangle = \frac{g'(1)}{g(1)}. \tag{26}$$

For those values of δ where no giant cluster exists, δ < δ_c, g(1) = 1, and both the
numerator and denominator of Eq. (24) go to zero as x → 1. Using L'Hôpital's rule
we derive a quadratic equation for g'(1). The solution of this equation is

$$g'(1) = \frac{1 - 2k\delta \pm \sqrt{(2k\delta - 1)^2 - 4k(k-1)\delta^2}}{2k(k-1)\delta^2} \tag{27}$$

for g(1) = 1. Because as δ → 0 all clusters will have size 1, one can show that the
correct solution of Eq. (27) is the one with the negative sign. In addition, from Eq. (27)
we can find the location of the phase transition. It is the value of δ where the solution
of Eq. (27) becomes complex:

$$\delta_c = \frac{1}{2}\left(1 - \sqrt{\frac{k-1}{k}}\right). \tag{28}$$
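For concreteness, the critical point and the slope g'(1) below it can be evaluated directly. The sketch assumes the quadratic k(k − 1)δ² g'(1)² + (2kδ − 1) g'(1) + 1 = 0 that follows from L'Hôpital's rule, takes the root with the negative sign, and locates δ_c where the discriminant vanishes; the function names are hypothetical.

```python
from math import sqrt


def critical_delta(k):
    """Value of delta at which the discriminant of the quadratic for g'(1) vanishes (k >= 2)."""
    if k < 2:
        raise ValueError("this expression applies only for k >= 2")
    return 0.5 * (1.0 - sqrt((k - 1) / k))


def slope_below_transition(k, delta):
    """g'(1) for delta below the critical value, negative-sign root."""
    disc = (2 * k * delta - 1) ** 2 - 4 * k * (k - 1) * delta**2
    if disc < 0:
        raise ValueError("delta is above the critical value; the root is complex")
    return (1 - 2 * k * delta - sqrt(disc)) / (2 * k * (k - 1) * delta**2)


if __name__ == "__main__":
    for k in (2, 3, 5):
        print(k, critical_delta(k))
    print(slope_below_transition(2, 0.05))
```

For k = 2 this gives a critical value near 0.146, and the slope tends to 1 as δ → 0, which is the reason for choosing the negative sign.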

In the region where there is a giant cluster, δ > δ_c, Eq. (24) becomes, as x → 1,

$$g'(1) = \frac{-g(1) + \bigl(1 + (g(1)-1)\delta\bigr)^k}{k\delta - k\delta\bigl(1 + (g(1)-1)\delta\bigr)^{k-1}}, \tag{29}$$
which is still not solvable analytically. Making the approximation (1 ± a)^k ≈ 1 ± ka
when a ≪ 1, we can simplify Eq. (29) close to δ_c:

$$g'(1) \approx \frac{1 - k\delta}{k(k-1)\delta^2}, \tag{30}$$

where g(1) < 1, δ > δ_c, and (g(1) − 1)δ ≪ 1. In Fig. 5 we show the simulation results
and the above derived theoretical functions for g'(1). We can see that for δ < δ_c, where
we have an explicit expression for g'(1) in terms of the parameters k and δ, the fit is
very good. For δ > δ_c the fit is good close to the phase transition point, where the
approximation (g − 1)δ ≪ 1 holds. Although below δ_c the description of g'(1) is very
good, it seems that the location of the phase transition and the value of the function
g'(1) above δ_c are somewhat different from the data. Also, if we carefully check Fig. 5
at the jumps, we find that the larger the jump the less accurate the theory seems to be.
This can be explained as follows. At the critical point the average size of finite clusters
jumps, hence much larger clusters appear in the network. As we can only simulate for a
finite time, large (but not giant) clusters are underrepresented. Their weights
computed from the simulation data are less than they would be in an infinitely long
simulation. Away from the transition regime fewer finite size clusters remain beside
the giant cluster in the network, and thus the distribution can be specified better.

Although the formalism using the generating function can be carried out for k = 1, the
meaning of a giant cluster is problematic. In Section 4.1 we showed that the size
distribution of clusters for k = 1 always follows a power law, which means there is no
obvious border between the 'giant' cluster and smaller clusters. There is no sharp
break between the largest and the next largest cluster. The physical reason for this is
that clusters grow only by the addition of newly added nodes. This is different from
the case k > 1 and from percolation models, where clusters can also grow by a link
combining two clusters. In this sense no giant cluster appears in the network except for
δ = 1. Eq. (24) becomes

$$g'(x) = \frac{(1-\delta) - g/x + \delta g}{\delta - x\delta}, \tag{31}$$

which becomes 0/0 in the limit x → 1 with g(1) = 1. Applying L'Hôpital's rule yields

$$g'(1) = \frac{1}{1 - 2\delta}. \tag{32}$$

At δ = 1/2, g'(1) → ∞, which means the average size of finite clusters approaches
infinity. From the definition of g(x) in Eq. (20) and the power-law cluster size distribution
for a_j, it follows that g(1) = 1 for any δ < 1. To see that g'(x) → ∞ as x → 1 for
δ ≥ 1/2, we consider the sum form of the generating function in Eq. (20). For large j,
a_j ~ j^{−(1 + 1/δ)} by Eq. (19), and g'(x) = Σ_{j=1}^{∞} j² a_j x^{j−1}, which cannot be
summed at x = 1 for δ ≥ 1/2.

When δ < 1/2, the probability of a new node not joining a cluster is higher than that of
joining, and thus the weight of small clusters is higher than that of larger clusters, and
hence the average size remains finite. As δ → 1/2, the probability of forming clusters
increases and so does the weight of large clusters.



[Figure 5 plot. Legend: + k = 2, o k = 3, x k = 5. Horizontal axis: δ, connection
probability, from 0.1 to 0.5.]

Figure 5: Discontinuity in g'(1) for different values of k. Solid lines are theoretical,
and symbols are results from the simulations of growing networks for 10^6 time steps,
averaged over 30 runs.



4.3 Infinite-order transition 



To show the nature of our phase transitions [29], we numerically integrated Eq. (24)
for different values of k near the corresponding critical δ_c. In Fig. 6 the linear parts of
the log(−log(S)) plots suggest that

$$S(\delta) \sim e^{\alpha(\delta - \delta_c)^{\beta}} \quad \text{as } \delta \to \delta_c, \tag{33}$$

and because all derivatives of S vanish at δ_c, the transition is of infinite order.

Table 1 contains the parameters of the fitted straight lines in Fig. 6. As the calculations
were done close to the numerical limit, and referring to the similar results in [29], we
conjecture that β equals −1/2 for all k. This result suggests that the mechanism of
the transition is common and that the number of possible partners for each node to link to
determines the speed of emergence of the giant cluster S. These results are in accord
with Eq. (30): the average cluster size decrease is approximately independent of k, but
the size of the jump and the rate of decrease are driven by k.
Figure 6: Numerical calculation of the giant cluster size close to but above the phase
transition. Least-squares fitted solid straight lines suggest S(δ) ~ e^{α(δ − δ_c)^β}.
The flat ends of the curves on the top appear due to the limit of the accuracy of numerical
integration.
k       α        β
…       -0.25    -0.577
…       -0.5     -0.569
…       -0.75    -0.557
…       -1.14    -0.554
…       -1.35    -0.551
20      -1.52    -0.552
25      -1.64    -0.551
30      -1.77    -0.554
40      -1.9     -0.551
50      -2.02    -0.55

Table 1: The parameter values (α and β) of the fitted lines in Fig. 6. Taking into
account that we were at the border of the maximal numerical accuracy and that the fit
is short, we presume β = −1/2.



5 Discussion



The present model was intended to gain insight into the evolution of various social
networks by considering mechanisms that account for heterogeneity in the population of
participating entities. To analyse the statistical properties of the generated network we
simplified the model. We found that the structure of the network dramatically changes
when the number of possible links to a newly added node increases from k = 1 to
k = 2. With k = 1 the network does not form a giant cluster but the average cluster
size goes to infinity (at δ = 1/2), in contrast to k ≥ 2, where the giant cluster appears
in an infinite-order phase transition and the average cluster size jumps discontinuously
but remains finite. The size of the jump corresponds to how slowly the giant cluster
overcomes the other competing large clusters. However, there is no transition for
k = 1, where none of the clusters can absorb other clusters. The distribution of the size
of finite clusters always follows an exponential distribution, both below and above the
critical point for k > 1, while the model studied in [31][29] is in a critical state below
and at the critical point and exhibits an exponential distribution of cluster size above the
transition, as in a Berezinskii-Kosterlitz-Thouless phase transition. Thus, even though
there are disconnected clusters in both models, there are significant differences in the
behavior of the cluster size distribution.

Our model is similar to a previous model of Callaway et al. [29], but there are essential
differences in several points due to the nature of the growth algorithm: in the model of
Callaway et al. network growth and connection formation are independent, while in
our model only newly added nodes form connections. Also, in our model multiple
connections might be formed in one time step depending on the parameters k and δ. This
difference is well reflected in the generating functions derived for the two models.

The structural properties of our model are more relevant to many social networks than
those of other growth models such as preferential attachment, because the degree
distribution is exponential, which is closer to real social systems, and because there are
clusters of nodes, which represent the reality of social systems where people usually form
various communities which are relatively isolated from each other. As long as the
distribution of nodal traits is random, the structural properties which we have discussed in
this paper do not depend on the nature of the traits, and thus our network model should
be relevant to any social network. The next step is to analyze the distribution of traits
on a social network. This will vary depending on how the attachment rule depends on
the values of these traits even though the structural properties of the network remain
the same. We will discuss the distribution of traits on a network in a future publication.



6 Conclusions 

We introduced a model of growing social networks and analyzed its statistical
properties. Our analytical calculations showed that these growing networks exhibit
exponential degree distributions. We gave an explicit description of the expected number of
edges, which showed an exponential dependence on the age of a node. We also showed
that the emergence of a giant cluster and the cluster-size distribution strongly depend on
the number of possible initial partners. Numerical simulations suggested that the
generated networks have scale-free cluster distributions only at the phase transition point.
In all other regions of the phase space the cluster distribution was exponential. In the
absence of an exact solution for Eq. (24), we showed numerical results suggesting that
the order of the phase transition is infinite, which is similar to the results found by [29].